Compactness score: a fast filter method for unsupervised feature selection
Authors
Abstract
The rapid development of the big data era incurs the generation of a huge amount of data day by day in various fields. Due to the large-scale and high-dimensional characteristics of these data, it is often difficult to achieve better decision-making in practical applications. Therefore, an efficient big data analytical method is urgently necessary. For feature engineering, feature selection seems to be an important research topic, which is anticipated to select “excellent” features from candidate ones. The implementation of feature selection can not only achieve the purpose of dimensionality reduction, but also improve the computational efficiency and result performance of the model. In many classification tasks, researchers found that data seem to be usually close to each other if they are from the same class; thus, local compactness is of great importance for the evaluation of a feature. Based on this discovery, we propose a fast unsupervised feature selection algorithm, named Compactness Score (CSUFS), to select desired features. To prove the superiority of the proposed algorithm, several public data sets are considered, with extensive experiments being performed. The experiments are presented by applying the feature subsets selected through different algorithms to the clustering task. The performance of the clustering tasks is indicated by two well-known evaluation metrics, while the efficiency is reflected by the corresponding running time. As demonstrated, our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
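The idea of scoring a feature by its local compactness can be illustrated with a minimal sketch. This is not the CSUFS algorithm from the paper, whose exact definition is not given in the abstract. It assumes a simple stand-in: each feature is min-max scaled, and its score is the average distance from each sample to its k nearest neighbours along that single feature, so a lower score means a more compact feature. The function names `local_compactness_scores` and `select_features` and the parameters `k` and `m` are hypothetical.

```python
import math


def local_compactness_scores(X, k=2):
    """Return one score per feature (illustrative definition, not CSUFS).

    A lower score means samples lie closer to their k nearest neighbours
    along that feature. X is a list of equal-length numeric rows.
    """
    n = len(X)
    d = len(X[0])
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")
    scores = []
    for j in range(d):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        if hi == lo:
            # Assumption: a constant feature carries no information, so it
            # is ranked last instead of being treated as perfectly compact.
            scores.append(math.inf)
            continue
        # Min-max scaling makes scores comparable across features.
        scaled = [(v - lo) / (hi - lo) for v in col]
        total = 0.0
        for i, v in enumerate(scaled):
            # Distances from sample i to every other sample on feature j.
            dists = sorted(abs(v - w) for t, w in enumerate(scaled) if t != i)
            total += sum(dists[:k]) / k
        scores.append(total / n)
    return scores


def select_features(X, k=2, m=1):
    """Return the indices of the m features with the lowest scores."""
    scores = local_compactness_scores(X, k)
    return sorted(range(len(scores)), key=scores.__getitem__)[:m]
```

Because no labels or clustering step are involved, a score of this kind acts as a filter criterion: features are ranked once and the top m are kept. The brute-force neighbour search here is quadratic in the number of samples and is meant only to make the idea concrete.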
Similar resources
Hierarchical fuzzy filter method for unsupervised feature selection
The problem of feature selection has long been an active research topic within statistics and pattern recognition. So far, most methods of feature selection focus on supervised data where class information is available. For unsupervised data, the related methods of feature selection are few. The presented article demonstrates a way of unsupervised feature selection, which is a two-level filter ...
A Filter Feature Selection Method for Clustering
High-dimensional data is a challenge for the KDD community. Feature Selection (FS) is an efficient preprocessing step for dimensionality reduction thanks to the removal of redundant and/or noisy features. Few and mostly recent FS methods have been proposed for clustering. Furthermore, most of them are "wrapper" methods that require the use of clustering algorithms for evaluating the selected ...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
Constraint Score: A new filter method for feature selection with pairwise constraints
Feature selection is an important preprocessing step in mining high-dimensional data. Generally, supervised feature selection methods with supervision information are superior to unsupervised ones without supervision information. In the literature, nearly all existing supervised feature selection methods use class labels as supervision information. In this paper, we propose to use another form ...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Journal
Journal title: Annals of Operations Research
Year: 2023
ISSN: 1572-9338, 0254-5330
DOI: https://doi.org/10.1007/s10479-023-05271-z